Improving pattern discovery relevancy by deriving constraints from expert models

Authors

  • Frédéric Flouvat
  • Jérémy Sanhes
  • Claude Pasquier
  • Nazha Selmaoui-Folcher
  • Jean-François Boulicaut
Abstract

Many pattern mining techniques have been proposed to support knowledge discovery from data. One bottleneck for their dissemination is the large number of computed patterns that turn out to be either trivial or uninteresting with respect to available knowledge. Integration of domain knowledge into constraint-based data mining remains limited, and relevant patterns are still missed because methods partly fail to assess their subjective interestingness. In practice, however, the literature often provides mathematical models that experts have defined from their domain knowledge. We propose to exploit such models to derive constraints that can be used during the data mining phase to improve both pattern relevancy and computational efficiency. Although the approach is generic, it is illustrated on pattern set discovery from real data for studying soil erosion.

Similar resources

Relevancy in Constraint-Based Subgroup Discovery

This chapter investigates subgroup discovery as a task of constraint-based mining of local patterns, aimed at describing groups of individuals with unusual distributional characteristics with respect to the property of interest. The chapter provides a novel interpretation of relevancy constraints and their use for feature filtering, introduces relevancy-based mechanisms for handling unknown val...

Data Peeler: Constraint-Based Closed Pattern Mining in n-ary Relations

Set pattern discovery from binary relations has been extensively studied during the last decade. In particular, many complete and efficient algorithms which extract frequent closed sets are now available. Generalizing such a task to n-ary relations (n ≥ 2) appears as a timely challenge. It may be important for many applications, e.g., when adding the time dimension to the popular objects × feat...

Reconstructing Sessions from Data Discovery and Access Logs to Build a Semantic Knowledge Base for Improving Data Discovery

Big geospatial data are archived and made available through online web discovery and access. However, finding the right data for scientific research and application development is still a challenge. This paper aims to improve the data discovery by mining the user knowledge from log files. Specifically, user web session reconstruction is focused upon in this paper as a critical step for extracti...

Discovery of Syllabic Percussion Patterns in Tabla Solo Recordings

We address the unexplored problem of percussion pattern discovery in Indian art music. Percussion in Indian art music uses onomatopoeic oral mnemonic syllables for the transmission of repertoire and technique. This is utilized for the task of percussion pattern discovery from audio recordings. From a parallel corpus of audio and expert curated scores for 38 tabla solo recordings, we use the scor...

Enhancing Process Mining Results using Domain Knowledge

Process discovery algorithms typically aim at discovering process models from event logs. Most discovery algorithms discover the model based on an event log, without allowing the domain expert to influence the discovery approach in any way. However, the user may have certain domain expertise which should be exploited to create a better process model. In this paper, we address this issue of inco...

Publication date: 2014